Metadata

%0 Conference Proceedings
%4 sid.inpe.br/sibgrapi/2019/09.09.23.37
%2 sid.inpe.br/sibgrapi/2019/09.09.23.37.24
%@doi 10.1109/SIBGRAPI.2019.00012
%T CV-C3D: Action Recognition on Compressed Videos with Convolutional 3D Networks
%D 2019
%A Santos, Samuel Felipe dos,
%A Sebe, Nicu,
%A Almeida, Jurandy,
%@affiliation Universidade Federal de São Paulo - UNIFESP, Brazil
%@affiliation University of Trento - UniTn, Italy
%@affiliation Universidade Federal de São Paulo - UNIFESP, Brazil
%E Oliveira, Luciano Rebouças de,
%E Sarder, Pinaki,
%E Lage, Marcos,
%E Sadlo, Filip,
%B Conference on Graphics, Patterns and Images, 32 (SIBGRAPI)
%C Rio de Janeiro, RJ, Brazil
%8 28-31 Oct. 2019
%I IEEE Computer Society
%J Los Alamitos
%S Proceedings
%K computer vision, action recognition, deep learning, compressed domain, efficiency
%X Action recognition in videos has gained substantial attention from the computer vision community due to its wide range of possible applications. Recent works have addressed this problem with deep learning methods. The main limitation of existing approaches is their difficulty in learning temporal dynamics, due to the high computational load of processing the large amounts of data required to train a model. To overcome this problem, we propose a Compressed Video Convolutional 3D network (CV-C3D). It exploits information from the compressed representation of a video in order to avoid the high computational cost of fully decoding the video stream. The resulting speed-up enables our network to use 3D convolutions to capture temporal context efficiently. Our network has the lowest computational complexity among all the compared approaches. On two public action recognition benchmarks, UCF-101 and HMDB-51, our approach achieved results comparable to the baselines, with the advantage of faster inference.
%@language en
%3 118paper.pdf